48 research outputs found

    AdaSwarm: Augmenting Gradient-Based Optimizers in Deep Learning with Swarm Intelligence

    This paper introduces AdaSwarm, a novel gradient-free optimizer whose performance is similar to, or even better than, that of the Adam optimizer used in neural networks. To support AdaSwarm, we propose a novel Exponentially weighted Momentum Particle Swarm Optimizer (EMPSO). The ability of AdaSwarm to tackle optimization problems is attributed to its capability to perform good gradient approximations. We show that the gradient of any function, differentiable or not, can be approximated using the parameters of EMPSO. This is a novel technique for simulating gradient descent (GD) that lies at the boundary between numerical methods and swarm intelligence. Mathematical proofs of the resulting gradient approximation are also provided. AdaSwarm competes closely with several state-of-the-art (SOTA) optimizers. We also show that AdaSwarm is able to handle a variety of loss functions during backpropagation, including the maximum absolute error (MAE).

    LogGENE: A smooth alternative to check loss for Deep Healthcare Inference Tasks

    Mining large datasets and obtaining calibrated predictions from them is of immediate relevance and utility in reliable deep learning. In our work, we develop methods for deep-neural-network-based inference on such datasets, for example gene expression data. Unlike typical deep learning methods, our inferential technique, while achieving state-of-the-art accuracy, can also provide explanations and report uncertainty estimates. We adopt the quantile regression framework to predict full conditional quantiles for a given set of housekeeping gene expressions. Conditional quantiles, in addition to providing rich interpretations of the predictions, are also robust to measurement noise. Our technique is particularly consequential in high-throughput genomics, an area that is ushering in a new era in personalized health care and targeted drug design and delivery. However, the check loss, used in quantile regression to drive the estimation process, is not differentiable. We propose log-cosh as a smooth alternative to the check loss. We apply our methods to the GEO microarray dataset. We also extend the method to the binary classification setting. Furthermore, we investigate other consequences of the smoothness of the loss, such as faster convergence. We further apply the classification framework to other healthcare inference tasks such as heart disease, breast cancer, and diabetes. As a test of the generalization ability of our framework, other non-healthcare datasets for regression and classification tasks are also evaluated.
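
    As a rough illustration of the idea in the abstract above, the sketch below compares the check (pinball) loss with one possible log-cosh smoothing of it. The specific smoothed form, the `sharpness` parameter, and all function names are assumptions made for illustration; the abstract does not give the authors' exact formulation. The sketch relies on the identity check(u) = |u|/2 + (tau - 1/2)·u and replaces |u| with log(cosh(s·u))/s, which is differentiable everywhere.

```python
import math


def check_loss(u, tau):
    """Check (pinball) loss for residual u = y - prediction at quantile tau."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def log_cosh(x):
    """Numerically stable log(cosh(x)), avoiding overflow for large |x|."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def smooth_check_loss(u, tau, sharpness=1.0):
    """Assumed smooth surrogate: |u| in the check loss replaced by log-cosh.

    Larger `sharpness` (hypothetical parameter) tracks the check loss more
    closely, up to a constant offset of log(2) / (2 * sharpness).
    """
    return (tau - 0.5) * u + log_cosh(sharpness * u) / (2.0 * sharpness)


def smooth_check_grad(u, tau, sharpness=1.0):
    """Derivative of the surrogate w.r.t. u; defined everywhere, including u = 0."""
    return (tau - 0.5) + 0.5 * math.tanh(sharpness * u)


if __name__ == "__main__":
    tau = 0.9
    for u in (-2.0, -0.1, 0.0, 0.1, 2.0):
        print(u, check_loss(u, tau), smooth_check_loss(u, tau, sharpness=10.0),
              smooth_check_grad(u, tau, sharpness=10.0))
```

    Under these assumptions the surrogate's gradient stays strictly between tau - 1 and tau, matching the two slopes of the check loss in the limit of large residuals while remaining smooth at zero.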

    Interfacial control of vortex-limited critical current in type-II superconductor films

    In a small subset of type-II superconductor films, the critical current is determined by a weakened Bean-Livingston barrier posed by the film surfaces to vortex penetration into the sample. A film property thus depends sensitively on the surface or interface to an adjacent material. We theoretically investigate the dependence of the vortex barrier and critical current in such films on the Rashba spin-orbit coupling at their interfaces with adjacent materials. Considering an interface with a magnetic insulator, we find that the spontaneous supercurrent resulting from the exchange field and interfacial spin-orbit coupling substantially modifies the vortex surface barrier, consistent with a previous prediction. Thus, we show that the critical currents in superconductor-magnet heterostructures can be controlled, and even enhanced, via the interfacial spin-orbit coupling. Since the latter can be controlled via a gate voltage, our analysis predicts a class of heterostructures amenable to gate-voltage modulation of superconducting critical currents. It also sheds light on the recently observed gate-voltage enhancement of critical current in NbN superconducting film.